QuesNet: A Unified Representation for Heterogeneous Test Questions
Understanding learning materials (e.g. test questions) is a crucial issue in
online learning systems, which can promote many applications in the education
domain. Unfortunately, many supervised approaches suffer from the problem of
scarce human labeled data, whereas abundant unlabeled resources are highly
underutilized. To alleviate this problem, an effective solution is to use
pre-trained representations for question understanding. However, existing
pre-training methods in the NLP area are ill-suited to learning test question
representations due to several domain-specific characteristics in education.
First, questions usually comprise heterogeneous data including content text,
images and side information. Second, there exist both basic linguistic
information as well as domain logic and knowledge. To this end, in this paper,
we propose a novel pre-training method, namely QuesNet, for comprehensively
learning question representations. Specifically, we first design a unified
framework to aggregate question information with its heterogeneous inputs into
a comprehensive vector. Then we propose a two-level hierarchical pre-training
algorithm to learn a better understanding of test questions in an unsupervised
way. Here, a novel holed language model objective is developed to extract
low-level linguistic features, and a domain-oriented objective is proposed to
learn high-level logic and knowledge. Moreover, we show that QuesNet has good
capability of being fine-tuned in many question-based tasks. We conduct
extensive experiments on large-scale real-world question data, where the
experimental results clearly demonstrate the effectiveness of QuesNet for
question understanding as well as its superior applicability.
Applying Deep Learning To Airbnb Search
The application to search ranking is one of the biggest machine learning
success stories at Airbnb. Much of the initial gains were driven by a gradient
boosted decision tree model. The gains, however, plateaued over time. This
paper discusses the work done in applying neural networks in an attempt to
break out of that plateau. We present our perspective not with the intention of
pushing the frontier of new modeling techniques. Instead, ours is a story of
the elements we found useful in applying neural networks to a real life
product. Deep learning was steep learning for us. To other teams embarking on
similar journeys, we hope an account of our struggles and triumphs will provide
some useful pointers. Bon voyage! Comment: 8 page
Learning to Ask: Question-based Sequential Bayesian Product Search
Product search is generally recognized as the first and foremost stage of
online shopping and thus significant for users and retailers of e-commerce.
Most of the traditional retrieval methods use some similarity functions to
match the user's query and the document that describes a product, either
directly or in a latent vector space. However, user queries are often too
general to capture the minute details of the specific product that a user is
looking for. In this paper, we propose a novel interactive method to
effectively locate the best matching product. The method is based on the
assumption that there is a set of candidate questions for each product to be
asked. In this work, we instantiate this candidate set by making the hypothesis
that products can be discriminated by the entities that appear in the documents
associated with them. We propose a Question-based Sequential Bayesian Product
Search method, QSBPS, which directly queries users on the expected presence of
entities in the relevant product documents. The method learns the product
relevance as well as the reward of the potential questions to be asked to the
user by being trained on the search history and purchase behavior of a specific
user together with that of other users. The experimental results show that the
proposed method can greatly improve the performance of product search compared
to the state-of-the-art baselines. Comment: This paper is accepted by CIKM 201
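The question-asking loop described in this abstract can be illustrated with a minimal sketch. It is not the QSBPS implementation: the names `update_belief` and `pick_question`, the fixed `noise` rate for user answers, and the greedy "split the belief mass in half" rule are all assumptions made for illustration, whereas the paper learns product relevance and question rewards from search history and purchase behavior.

```python
from typing import Dict, Set


def update_belief(belief: Dict[str, float],
                  product_entities: Dict[str, Set[str]],
                  entity: str,
                  answer_yes: bool,
                  noise: float = 0.1) -> Dict[str, float]:
    """Bayes update of P(product is the target) after a yes/no answer.

    `noise` is an assumed probability that the user's answer is wrong.
    """
    posterior = {}
    for product, prior in belief.items():
        has_entity = entity in product_entities[product]
        # Likelihood of the observed answer if this product were the target.
        likelihood = (1.0 - noise) if has_entity == answer_yes else noise
        posterior[product] = prior * likelihood
    total = sum(posterior.values())
    return {product: value / total for product, value in posterior.items()}


def pick_question(belief: Dict[str, float],
                  product_entities: Dict[str, Set[str]],
                  asked: Set[str]) -> str:
    """Hypothetical greedy rule: ask about the unasked entity whose
    belief mass is closest to 0.5, so either answer is informative."""
    candidates = set().union(*product_entities.values()) - asked

    def mass(entity: str) -> float:
        return sum(b for p, b in belief.items()
                   if entity in product_entities[p])

    # Sorting makes tie-breaking deterministic (alphabetical order).
    return min(sorted(candidates), key=lambda e: abs(mass(e) - 0.5))


# Toy catalogue: four products, each described by a set of entities.
products = {
    "p1": {"red", "cotton"},
    "p2": {"red", "wool"},
    "p3": {"blue", "wool"},
    "p4": {"blue", "silk"},
}
belief = {name: 0.25 for name in products}
question = pick_question(belief, products, asked=set())
belief_after = update_belief(belief, products, question, answer_yes=True)
```

In this toy run the belief concentrates on the products whose documents contain the queried entity, and the loop would repeat until one product dominates.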
Substantial transition to clean household energy mix in rural China
The household energy mix has significant impacts on human health and climate, as it contributes greatly to many health- and climate-relevant air pollutants. Compared to the well-established urban energy statistical system, the rural household energy statistical system is incomplete and is often associated with high biases. Via a nationwide investigation, this study revealed high contributions to energy supply from coal and biomass fuels in the rural household energy sector, while electricity comprised ∼20%. Stacking (the use of multiple sources of energy) is significant, and the average number of energy types was 2.8 per household. Compared to 2012, the consumption of biomass and coal in 2017 decreased by 45% and 12%, respectively, while gas consumption increased by 204%. Increased gas and decreased coal consumption were mainly in cooking, while decreased biomass was in both cooking (41%) and heating (59%). The time-sharing fraction of electricity and gases (E&G) for daily cooking grew, reaching 69% in 2017, but for space heating, traditional solid fuels were still dominant, with the national average shared fraction of E&G being only 20%. The non-uniform spatial distribution and the non-linear increase in the fraction of E&G indicated challenges to achieving universal access to modern cooking energy by 2030, particularly in less-developed rural and mountainous areas. In some non-typical heating zones, the increased share of E&G for heating was significant and largely driven by income growth, but in typical heating zones, the time-sharing fraction was <5% and was not significantly increased, except in areas with policy intervention. The intervention policy not only led to dramatic increases in the clean energy fraction for heating but also accelerated the clean cooking transition. Higher income, higher education, younger age, less energy/stove stacking and smaller family size positively impacted the clean energy transition.
Intent modeling and automatic query reformulation for search engine systems
Understanding and modeling users' intent in search queries is an important topic in studying search engine systems. Good understanding of search intent is required in order to achieve better search accuracy and better user experience. In this thesis work, I identify and study three major problems in the subject: ambiguous search intent, ineffective query formulation and vague relevance criteria. To systematically study these problems, the thesis consists of three parts. In the first part, I study search intent ambiguity in search engine queries and propose a click pattern-based method that captures ambiguous search intent based on behavioral difference rather than semantic difference. Analysis shows that the proposed method is more accurate and robust in measuring query ambiguity. In the second part, I study how to provide query formulation support to help users express search intent. Query completion and correction, and syntactic query reformulation are proposed and studied in this part. Experiments show that the proposed query formulation support methods can help users formulate more effective queries and alleviate search difficulty. In the third part, I study how to model search intent so that we can gain insights about users' behaviors and leverage the knowledge to improve search engines. Two topics are studied in this part: modeling search intent with data-level representation and discovering coordinated shopping intent in product search. It is shown that the proposed methods can not only discover meaningful user intent but also improve search and other related applications. The proposed models and algorithms in the thesis are general and can be applied to improve search accuracy in potentially many different search engines. As a systematic study on intent modeling and automatic query reformulation in search engine systems, this thesis work also provides a road map to future exploration on intent understanding and analysis.
Online Spelling Correction for Query Completion
In this paper, we study the problem of online spelling correction for query completions. Misspelling is a common phenomenon among search engine queries. In order to help users effectively express their information needs, mechanisms for automatically correcting misspelled queries are required. Online spelling correction aims to provide spell-corrected completion suggestions as a query is incrementally entered. As latency is crucial to the utility of the suggestions, such an algorithm needs to be not only accurate, but also efficient. To tackle this problem, we propose and study a generative model for input queries, based on a noisy channel transformation of the intended queries. Utilizing spelling correction pairs, we train a Markov n-gram transformation model that captures user spelling behavior in an unsupervised fashion. To find the top spell-corrected completion suggestions in real-time, we adapt the A* search algorithm with various pruning heuristics to dynamically expand the search space efficiently. Evaluation of the proposed methods demonstrates a substantial increase in the effectiveness of online spelling correction over existing techniques.
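The noisy channel idea in this abstract can be sketched as ranking each candidate query by log P(query) + log P(typed prefix | query). The sketch below rests on loud assumptions and is not the paper's model: the channel is a toy per-character error rate over Levenshtein edits rather than the trained Markov n-gram transformation model, candidates are enumerated by brute force rather than expanded with A* search, and the names `edit_distance`, `channel_log_prob`, `suggest`, and the query counts are hypothetical.

```python
import math
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def channel_log_prob(typed: str, intended_prefix: str,
                     error_rate: float = 0.05) -> float:
    """Toy channel (an assumption): every edit costs log(error_rate),
    every remaining typed character costs log(1 - error_rate)."""
    edits = edit_distance(typed, intended_prefix)
    kept = max(len(typed) - edits, 0)
    return edits * math.log(error_rate) + kept * math.log(1.0 - error_rate)


def suggest(typed: str, query_counts: Dict[str, int], k: int = 3) -> List[str]:
    """Rank full queries by log prior plus the best channel score over
    all prefixes of the query that the typed text might correspond to."""
    total = sum(query_counts.values())
    scored = []
    for query, count in query_counts.items():
        best = max(channel_log_prob(typed, query[:n])
                   for n in range(len(query) + 1))
        scored.append((math.log(count / total) + best, query))
    scored.sort(reverse=True)
    return [query for _, query in scored[:k]]


# Hypothetical query log counts, for illustration only.
counts = {"britney spears": 50, "brittany": 10, "bridge": 40}
top_two = suggest("britny", counts, k=2)
```

Brute-force scoring of every logged query is linear in the log size for each keystroke, which is why the abstract stresses latency and an A* search with pruning instead.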